Identify High-Quality Protein Structural Models by Enhanced K-Means
Authors
Abstract
Background. A critical issue in protein three-dimensional structure prediction, whether by ab initio or comparative modeling, is the identification of high-quality protein structural models among the generated decoys. Clustering algorithms are currently widely used to identify near-native models; however, their performance depends on the conformational decoys being clustered, and, for some algorithms, accuracy declines as the decoy population increases. Results. Here, we propose two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the second uses squared distances to optimize the initial centroids (K-means++). SK-means and K-means++ were more robust than SPICKER alone: for 33 (59%) and 42 (75%) of 56 targets, respectively, they identified models with template modeling scores better than or equal to those of SPICKER. Conclusions. The classic K-means algorithm performed similarly to SPICKER, a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements over SPICKER and classic K-means.
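The squared-distance seeding that the abstract attributes to K-means++ can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each decoy has already been reduced to a short numeric feature vector and compares vectors by plain Euclidean distance, whereas a structure-clustering pipeline would normally use a structural similarity measure. The function names (`sq_dist`, `kmeans_pp_init`, `assign`, `kmeans`) are hypothetical and chosen only for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """Choose k initial centroids by squared-distance (D^2) weighted sampling."""
    centroids = [rng.choice(points)]  # first centroid: uniform at random
    # d2[i] = squared distance from points[i] to its nearest chosen centroid
    d2 = [sq_dist(p, centroids[0]) for p in points]
    while len(centroids) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Points far from all current centroids are more likely to be picked.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centroids.append(nxt)
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=50, seed=0):
    """Basic K-means (Lloyd iterations) started from K-means++ seeds."""
    rng = random.Random(seed)
    centroids = kmeans_pp_init(points, k, rng)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid: coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # centroids stopped moving
        centroids = updated
    return centroids, assign(points, centroids)
```

Calling `kmeans(points, k)` on a list of equal-length tuples returns the final centroids and one cluster label per point. Under the same assumptions, an SK-means-style variant would replace the call to `kmeans_pp_init` with centroids supplied by another clustering method, which the abstract identifies as SPICKER.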
Journal title:
Volume 2017, Issue
Pages -
Publication date 2017